Multi-GPU Computing for Achieving Speedup in Real-time Aggregate Risk Analysis
Author
Abstract
Stochastic simulation techniques employed for portfolio risk analysis, often referred to as Aggregate Risk Analysis, can benefit from exploiting state-of-the-art high-performance computing platforms. In this paper, we propose parallel methods to speed up aggregate risk analysis in support of real-time pricing. To achieve this, an algorithm for analysing aggregate risk is proposed and implemented in C and OpenMP for multi-core CPUs and in C and CUDA for many-core GPUs. An evaluation of the algorithm's performance indicates that GPUs offer a feasible alternative to traditional high-performance computing systems. An aggregate simulation of 1 million trials with 1,000 catastrophic events per trial, over a typical exposure set and contract structure, runs on multiple GPUs in less than 5 seconds. The key result is that the multi-GPU implementation of the algorithm presented in this paper is approximately 77x faster than its traditional counterpart and can be used in real-time pricing scenarios.

Keywords: GPU computing; high-performance computing; aggregate risk analysis; catastrophe event risk; real-time pricing
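The abstract does not spell out the simulation's inner loop, but the structure it describes (independent trials, each containing a fixed number of catastrophic events whose losses are aggregated under contract terms) maps naturally onto one CUDA thread per trial. The sketch below is an illustrative assumption of that mapping, not the paper's actual algorithm or data layout; the kernel name, the flattened array layout, and the simple occurrence deductible/limit terms are invented for the example.

```cuda
#include <cuda_runtime.h>

/* Illustrative sketch: one thread per simulated trial. Each thread walks the
 * events of its trial, looks up the loss for each event, applies simple
 * occurrence terms (deductible and limit), and accumulates the trial's
 * aggregate loss. Names and layout are assumptions, not the paper's code. */
__global__ void aggregateRiskKernel(const int *eventIds,    /* [numTrials * eventsPerTrial] */
                                    const float *eventLoss, /* loss per event id            */
                                    float *trialLoss,       /* [numTrials] output           */
                                    int numTrials, int eventsPerTrial,
                                    float deductible, float limit)
{
    int trial = blockIdx.x * blockDim.x + threadIdx.x;
    if (trial >= numTrials) return;

    float aggregate = 0.0f;
    for (int e = 0; e < eventsPerTrial; ++e) {
        int id = eventIds[trial * eventsPerTrial + e];
        float loss = eventLoss[id];
        loss = fminf(fmaxf(loss - deductible, 0.0f), limit); /* apply occurrence terms */
        aggregate += loss;
    }
    trialLoss[trial] = aggregate;
}

int main(void)
{
    /* Small placeholder sizes; the paper's evaluation uses 1 million trials
     * with 1,000 events per trial. */
    const int numTrials = 1024, eventsPerTrial = 1000, numEvents = 10000;

    int *dEventIds; float *dEventLoss, *dTrialLoss;
    cudaMalloc(&dEventIds, numTrials * eventsPerTrial * sizeof(int));
    cudaMalloc(&dEventLoss, numEvents * sizeof(float));
    cudaMalloc(&dTrialLoss, numTrials * sizeof(float));
    /* Placeholder data: zero-filled. A real run would copy the simulated
     * event table and the event loss table to the device here. */
    cudaMemset(dEventIds, 0, numTrials * eventsPerTrial * sizeof(int));
    cudaMemset(dEventLoss, 0, numEvents * sizeof(float));

    int threads = 256, blocks = (numTrials + threads - 1) / threads;
    aggregateRiskKernel<<<blocks, threads>>>(dEventIds, dEventLoss, dTrialLoss,
                                             numTrials, eventsPerTrial,
                                             100000.0f, 1000000.0f);
    cudaDeviceSynchronize();

    cudaFree(dEventIds); cudaFree(dEventLoss); cudaFree(dTrialLoss);
    return 0;
}
```

A multi-GPU run of this shape would partition the trial range across devices (selecting each with cudaSetDevice from its own host thread) and merge the per-trial losses on the host; because trials are independent, the work splits cleanly across GPUs.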
Similar Papers
An approach to Improve Particle Swarm Optimization Algorithm Using CUDA
The time consumed in solving computationally heavy problems has always been a concern for computer programmers. Due to the simplicity of its implementation, PSO (Particle Swarm Optimization) is a suitable meta-heuristic algorithm for solving computationally heavy problems. However, despite this simplicity, the algorithm is inefficient for solving real computationally heavy problems, but the pr...
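The truncated abstract does not show this paper's specific improvement, but the usual starting point for a CUDA PSO is to map the classic velocity/position update to one thread per particle-dimension element. The kernel below is a minimal, hypothetical sketch of that mapping; the parameters w, c1 and c2 follow textbook PSO notation, and the precomputed random arrays r1/r2 stand in for a cuRAND-based generator to keep the example short.

```cuda
#include <cuda_runtime.h>

/* Hypothetical sketch of the classic PSO update, one thread per
 * (particle, dimension) element of the flattened position/velocity arrays.
 * r1 and r2 hold uniform random numbers in [0,1); a full implementation
 * would generate them on the device with cuRAND. */
__global__ void psoUpdateKernel(float *pos, float *vel,
                                const float *pbest,  /* per-particle best positions, flattened */
                                const float *gbest,  /* global best position, length dim       */
                                const float *r1, const float *r2,
                                int numParticles, int dim,
                                float w, float c1, float c2)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= numParticles * dim) return;

    int d = i % dim;  /* dimension index, used to read the global best */
    vel[i] = w * vel[i]
           + c1 * r1[i] * (pbest[i] - pos[i])
           + c2 * r2[i] * (gbest[d] - pos[i]);
    pos[i] += vel[i];
}
```

Fitness evaluation and the reduction that updates pbest/gbest would run as separate kernels between iterations; how those steps are organised is typically where a CUDA PSO gains or loses its efficiency.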
A real-time GPU implementation of the SIFT algorithm for large-scale video analysis tasks
The SIFT algorithm is one of the most popular feature extraction methods and is therefore widely used in all sorts of video analysis tasks, such as instance search and duplicate/near-duplicate detection. We present an efficient GPU implementation of the SIFT descriptor extraction algorithm using CUDA. The major steps of the algorithm are presented, and for each step we describe how to efficiently paral...
Importance of Explicit Vectorization for CPU and GPU Software Performance
Much of the current focus in high-performance computing is on multi-threading, multi-computing, and graphics processing unit (GPU) computing. However, vectorization and non-parallel optimization techniques, which can often be employed additionally, are less frequently discussed. In this paper, we present an analysis of several optimizations done on both central processing unit (CPU) and GPU imp...
A Distributed Multi-GPU System for Fast Graph Processing
We present Lux, a distributed multi-GPU system that achieves fast graph processing by exploiting the aggregate memory bandwidth of multiple GPUs and taking advantage of locality in the memory hierarchy of multi-GPU clusters. Lux provides two execution models that optimize algorithmic efficiency and enable important GPU optimizations, respectively. Lux also uses a novel dynamic load balancing st...
Real-time 3D Video Processing Using Multi-stream GPU Parallel Computing
This work presents a real-time video processing algorithm for 3D scenes using a graphics processor. The processing is based on parallel computing using concurrent kernels. The proposed algorithm processes individual pixels of each pair of input stereo images to obtain an anaglyph image for each frame. To reduce the computational time, a concurrent kernel implementation using POSIX threads and C...
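The per-pixel step described here, where each thread combines one pixel from the left and right stereo images into an anaglyph pixel, is simple enough to sketch. The kernel below assumes an RGBA uchar4 layout and the common red-cyan mixing (red from the left view, green/blue from the right); the paper's actual colour scheme and buffer formats are not given in the excerpt, so treat these as illustrative assumptions.

```cuda
#include <cuda_runtime.h>

/* Illustrative sketch: one thread per pixel builds a red-cyan anaglyph by
 * taking the red channel from the left image and the green/blue channels
 * from the right image. The uchar4 RGBA layout is an assumption. */
__global__ void anaglyphKernel(const uchar4 *left, const uchar4 *right,
                               uchar4 *out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    int i = y * width + x;
    uchar4 l = left[i], r = right[i];
    out[i] = make_uchar4(l.x, r.y, r.z, 255);  /* red from left, cyan from right */
}
```

Launched once per frame with a 2D grid (for example, 16x16 thread blocks covering the image) and issued on a separate CUDA stream per video stream, such a kernel matches the concurrent-kernel, multi-stream organisation the abstract describes.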
Journal title:
Volume / Issue:
Pages: -
Publication date: 2013